Overview

Dataset statistics

Number of variables: 24
Number of observations: 367705
Missing cells: 2191772
Missing cells (%): 24.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 78.2 MiB
Average record size in memory: 223.0 B
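
The two derived figures in this block follow from the others by simple arithmetic. The sketch below recomputes them from the numbers printed above; the variable names are illustrative and are not taken from the profiling tool.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 24
n_observations = 367_705
missing_cells = 2_191_772
total_memory_mib = 78.2

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024**2 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both results agree with the percentages and sizes reported above.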

Variable types

CAT: 11
NUM: 8
DATE: 3
BOOL: 2

Warnings

TIPOEMAIL has a high cardinality: 24651 distinct values [High cardinality]
USU_TELF has a high cardinality: 23626 distinct values [High cardinality]
IP_Country has a high cardinality: 82 distinct values [High cardinality]
USU_CIIU has a high cardinality: 595 distinct values [High cardinality]
N_sesiones is highly correlated with Ficha Básica [High correlation]
Ficha Básica is highly correlated with N_sesiones [High correlation]
IP_Area is highly correlated with IP_Country [High correlation]
IP_Country is highly correlated with IP_Area [High correlation]
CANAL_REGISTRO has 7534 (2.0%) missing values [Missing]
IP_Country has 21764 (5.9%) missing values [Missing]
IP_Area has 21764 (5.9%) missing values [Missing]
USU_TIPO has 283591 (77.1%) missing values [Missing]
USU_TAMANIO has 283589 (77.1%) missing values [Missing]
USU_CIIU has 283589 (77.1%) missing values [Missing]
USU_ESTADO has 283589 (77.1%) missing values [Missing]
USU_DEPARTAMENTO has 277197 (75.4%) missing values [Missing]
FEC_CLIENTE has 365092 (99.3%) missing values [Missing]
FEC_ALTA has 363994 (99.0%) missing values [Missing]
Ficha Básica is highly skewed (γ1 = 223.0141021) [Skewed]
N_logins is highly skewed (γ1 = 77.6662565) [Skewed]
N_sesiones is highly skewed (γ1 = 218.6421765) [Skewed]
IDUSUARIO has unique values [Unique]
BONDAD_EMAIL has 54067 (14.7%) zeros [Zeros]
IPCASOS has 15822 (4.3%) zeros [Zeros]
Ficha Básica has 250633 (68.2%) zeros [Zeros]
Perfil Promocional has 24290 (6.6%) zeros [Zeros]
N_logins has 170625 (46.4%) zeros [Zeros]

Reproduction

Analysis started: 2022-05-15 02:47:43.563358
Analysis finished: 2022-05-15 02:48:59.998141
Duration: 1 minute and 16.43 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
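
The report does not show the call that produced it. As a minimal sketch, assuming the data is already loaded in a pandas DataFrame, a report of this kind can be generated with the `ProfileReport` class of pandas-profiling. The function name `build_report`, the title and the output path are hypothetical, and the settings stored in config.yaml are not reproduced here.

```python
def build_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report to output_path.

    `df` is assumed to be a pandas DataFrame; the title and output path
    are illustrative placeholders, not values taken from this report.
    """
    # Imported lazily so that defining this function does not require
    # pandas-profiling to be installed.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title="Pandas Profiling Report")
    profile.to_file(output_path)
    return output_path
```

Calling `build_report(df)` on the original dataset should yield the same sections as this report (Overview, Variables, Interactions, Correlations, Missing values, Sample), although the exact output depends on the library version and configuration.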

Variables

IDUSUARIO
Categorical

UNIQUE

Distinct: 367705
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
7802524 | 1 | < 0.1%
7398341 | 1 | < 0.1%
7566377 | 1 | < 0.1%
7604706 | 1 | < 0.1%
7283966 | 1 | < 0.1%
7885379 | 1 | < 0.1%
7455915 | 1 | < 0.1%
7941715 | 1 | < 0.1%
7703927 | 1 | < 0.1%
7163218 | 1 | < 0.1%
Other values (367695) | 367695 | > 99.9%

[Figure: Frequencies of value counts]

Unique

Unique: 367705
Unique (%): 100.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

TIPOUSUARIO
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
PF | 265760 | 72.3%
PJ | 89824 | 24.4%
PX | 12121 | 3.3%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

FEC_REGISTRO
Date

Distinct: 730
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB
Minimum: 2018-01-01 00:00:00
Maximum: 2019-12-31 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

CANAL_REGISTRO
Real number (ℝ≥0)

MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 7534
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.896654645
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 7
95th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.412749497
Coefficient of variation (CV): 0.6191848436
Kurtosis: -0.9092001722
Mean: 3.896654645
Median Absolute Deviation (MAD): 1
Skewness: 0.7642777622
Sum: 1403462
Variance: 5.821360133
Monotonicity: Not monotonic
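
Several of the statistics above are derived from the others. The following sketch recomputes the range, the interquartile range, the coefficient of variation and the variance from the CANAL_REGISTRO figures; the variable names are illustrative, and the same relationships hold for the other numeric variables in this report.

```python
# Values copied from the CANAL_REGISTRO statistics above.
minimum, q1, q3, maximum = 1, 2, 7, 9
mean = 3.896654645
std = 2.412749497

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.6f}")
```

The results match the reported values up to rounding of the inputs.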
[Figure: Histogram with fixed size bins (bins=8)]

Common values

Value | Count | Frequency (%)
3 | 118939 | 32.3%
2 | 84629 | 23.0%
8 | 47710 | 13.0%
7 | 37157 | 10.1%
1 | 36461 | 9.9%
4 | 16357 | 4.4%
6 | 12181 | 3.3%
9 | 6737 | 1.8%
(Missing) | 7534 | 2.0%

Minimum 5 values

Value | Count | Frequency (%)
1 | 36461 | 9.9%
2 | 84629 | 23.0%
3 | 118939 | 32.3%
4 | 16357 | 4.4%
6 | 12181 | 3.3%

Maximum 5 values

Value | Count | Frequency (%)
9 | 6737 | 1.8%
8 | 47710 | 13.0%
7 | 37157 | 10.1%
6 | 12181 | 3.3%
4 | 16357 | 4.4%

IND_CLIENTE
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
0 | 365086 | 99.3%
1 | 2619 | 0.7%

IND_ALTA
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
0 | 363995 | 99.0%
1 | 3710 | 1.0%

TIPOEMAIL
Categorical

HIGH CARDINALITY

Distinct: 24651
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
gmail.com | 159897 | 43.5%
hotmail.com | 121607 | 33.1%
yahoo.com | 6503 | 1.8%
yahoo.es | 5239 | 1.4%
outlook.com | 4884 | 1.3%
hotmail.es | 3638 | 1.0%
yopmail.com | 3366 | 0.9%
misena.edu.co | 2792 | 0.8%
outlook.es | 1764 | 0.5%
live.com | 1062 | 0.3%
Other values (24641) | 56953 | 15.5%

[Figure: Frequencies of value counts]

Unique

Unique: 19738
Unique (%): 5.4%

[Figure: Histogram of lengths of the category]

Length

Max length: 41
Median length: 10
Mean length: 10.40228172
Min length: 4

BONDAD_EMAIL
Real number (ℝ)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.97477869
Minimum: -20
Maximum: 20
Zeros: 54067
Zeros (%): 14.7%
Memory size: 2.8 MiB

Quantile statistics

Minimum: -20
5th percentile: -10
Q1: 9
Median: 20
Q3: 20
95th percentile: 20
Maximum: 20
Range: 40
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 11.07176257
Coefficient of variation (CV): 0.7922674709
Kurtosis: 1.376678552
Mean: 13.97477869
Median Absolute Deviation (MAD): 0
Skewness: -1.609586954
Sum: 5138596
Variance: 122.5839265
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value | Count | Frequency (%)
20 | 274960 | 74.8%
0 | 54067 | 14.7%
-10 | 16825 | 4.6%
-20 | 12051 | 3.3%
1 | 4944 | 1.3%
9 | 4858 | 1.3%

Minimum 5 values

Value | Count | Frequency (%)
-20 | 12051 | 3.3%
-10 | 16825 | 4.6%
0 | 54067 | 14.7%
1 | 4944 | 1.3%
9 | 4858 | 1.3%

Maximum 5 values

Value | Count | Frequency (%)
20 | 274960 | 74.8%
9 | 4858 | 1.3%
1 | 4944 | 1.3%
0 | 54067 | 14.7%
-10 | 16825 | 4.6%

USU_TELF
Categorical

HIGH CARDINALITY

Distinct: 23626
Distinct (%): 6.4%
Missing: 66
Missing (%): < 0.1%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
174XXXXX | 926 | 0.3%
44XXXXX | 747 | 0.2%
145XXXXX | 711 | 0.2%
32XXXXX | 711 | 0.2%
1174XXXXX | 685 | 0.2%
444XXXXX | 673 | 0.2%
0544XXXXX | 667 | 0.2%
33XXXXX | 633 | 0.2%
74XXXXX | 628 | 0.2%
175XXXXX | 619 | 0.2%
Other values (23616) | 360639 | 98.1%

[Figure: Frequencies of value counts]

Unique

Unique: 9628
Unique (%): 2.6%

[Figure: Histogram of lengths of the category]

Length

Max length: 20
Median length: 10
Mean length: 9.636969854
Min length: 3

IPCASOS
Real number (ℝ≥0)

ZEROS

Distinct: 277
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 351.3182986
Minimum: 0
Maximum: 16393
Zeros: 15822
Zeros (%): 4.3%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 1
Median: 2
Q3: 6
95th percentile: 1708
Maximum: 16393
Range: 16393
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 1692.108796
Coefficient of variation (CV): 4.816455057
Kurtosis: 46.4272444
Mean: 351.3182986
Median Absolute Deviation (MAD): 1
Skewness: 6.491427918
Sum: 129181495
Variance: 2863232.178
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 164363 | 44.7%
2 | 46196 | 12.6%
3 | 23891 | 6.5%
0 | 15822 | 4.3%
4 | 14364 | 3.9%
5 | 9855 | 2.7%
6 | 6935 | 1.9%
7 | 5517 | 1.5%
8 | 4387 | 1.2%
9 | 3420 | 0.9%
Other values (267) | 72955 | 19.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 15822 | 4.3%
1 | 164363 | 44.7%
2 | 46196 | 12.6%
3 | 23891 | 6.5%
4 | 14364 | 3.9%

Maximum 5 values

Value | Count | Frequency (%)
16393 | 988 | 0.3%
13023 | 1378 | 0.4%
12998 | 1037 | 0.3%
10762 | 1346 | 0.4%
7114 | 1232 | 0.3%

IP_Country
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING

Distinct: 82
Distinct (%): < 0.1%
Missing: 21764
Missing (%): 5.9%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
Colombia | 340368 | 92.6%
United States | 2703 | 0.7%
Peru | 436 | 0.1%
Venezuela | 297 | 0.1%
Spain | 275 | 0.1%
Brazil | 218 | 0.1%
Argentina | 211 | 0.1%
Ecuador | 184 | 0.1%
Chile | 165 | < 0.1%
Mexico | 157 | < 0.1%
Other values (72) | 927 | 0.3%
(Missing) | 21764 | 5.9%

[Figure: Frequencies of value counts]

Unique

Unique: 23
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 33
Median length: 8
Mean length: 7.730797786
Min length: 3

IP_Area
Categorical

HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 21764
Missing (%): 5.9%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
America | 345170 | 93.9%
Europa | 621 | 0.2%
Asia | 113 | < 0.1%
Oceania | 34 | < 0.1%
Africa | 3 | < 0.1%
(Missing) | 21764 | 5.9%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 7
Median length: 7
Mean length: 6.760626045
Min length: 3

USU_TIPO
Categorical

MISSING

Distinct: 9
Distinct (%): < 0.1%
Missing: 283591
Missing (%): 77.1%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
EMPRESARIO INDIVIDUAL | 39912 | 10.9%
SOCIEDAD COMERCIAL/INDUSTRIAL | 37840 | 10.3%
ENTIDAD FINANCIERA O DE SEGUROS | 2509 | 0.7%
ENTIDAD SIN ANIMO DE LUCRO | 2479 | 0.7%
ORGANISMO ESTATAL | 691 | 0.2%
HOLDING | 393 | 0.1%
ENTIDAD EXTRANJERA | 275 | 0.1%
SOCIEDAD NO COMERCIAL | 13 | < 0.1%
INDUSTRIA / COMERCIO | 2 | < 0.1%
(Missing) | 283591 | 77.1%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 31
Median length: 3
Mean length: 8.018055234
Min length: 3

USU_TAMANIO
Categorical

MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 283589
Missing (%): 77.1%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
MC | 60626 | 16.5%
PQ | 10113 | 2.8%
GR | 5822 | 1.6%
MD | 5739 | 1.6%
SD | 1816 | 0.5%
(Missing) | 283589 | 77.1%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 3
Median length: 3
Mean length: 2.771240532
Min length: 2

USU_CIIU
Categorical

HIGH CARDINALITY
MISSING

Distinct: 595
Distinct (%): 0.7%
Missing: 283589
Missing (%): 77.1%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
G4711 | 2590 | 0.7%
I5611 | 2004 | 0.5%
M7110 | 1940 | 0.5%
M7020 | 1717 | 0.5%
G4771 | 1609 | 0.4%
G4773 | 1553 | 0.4%
N8299 | 1530 | 0.4%
I5630 | 1527 | 0.4%
F4290 | 1464 | 0.4%
S9499 | 1441 | 0.4%
Other values (585) | 66741 | 18.2%
(Missing) | 283589 | 77.1%

[Figure: Frequencies of value counts]

Unique

Unique: 69
Unique (%): 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 5
Median length: 3
Mean length: 3.456112917
Min length: 3

USU_ESTADO
Categorical

MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 283589
Missing (%): 77.1%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
ACTIVA | 67580 | 18.4%
CANCELACIÓN | 13076 | 3.6%
LIQUIDACION | 2362 | 0.6%
LEY DE INSOLVENCIA (REORGANIZACION EMPRESARIAL) | 500 | 0.1%
EXTINGUIDA | 275 | 0.1%
INACTIVA TEMPORAL | 261 | 0.1%
REESTRUCTURACION O CONCORDATO | 32 | < 0.1%
INTERVENIDA | 9 | < 0.1%
COINCIDENCIA HOMOGRAFA LISTA CLINTON (SDNT OFAC) | 9 | < 0.1%
SALIDA CLINTON (SDNT OFAC) | 7 | < 0.1%
(Missing) | 283589 | 77.1%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 48
Median length: 3
Mean length: 3.966489441
Min length: 3

USU_DEPARTAMENTO
Categorical

MISSING

Distinct: 33
Distinct (%): < 0.1%
Missing: 277197
Missing (%): 75.4%
Memory size: 2.8 MiB

Value | Count | Frequency (%)
BOGOTA | 33307 | 9.1%
ANTIOQUIA | 11168 | 3.0%
VALLE | 7312 | 2.0%
CUNDINAMARCA | 4933 | 1.3%
ATLANTICO | 4284 | 1.2%
SANTANDER | 3888 | 1.1%
BOYACA | 2170 | 0.6%
BOLIVAR | 2100 | 0.6%
RISARALDA | 2085 | 0.6%
NORTE SANTANDER | 2004 | 0.5%
Other values (23) | 17257 | 4.7%
(Missing) | 277197 | 75.4%

[Figure: Frequencies of value counts]

Unique

Unique: 2
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 15
Median length: 3
Mean length: 4.036072395
Min length: 3

FEC_CLIENTE
Date

MISSING

Distinct: 810
Distinct (%): 31.0%
Missing: 365092
Missing (%): 99.3%
Memory size: 2.8 MiB
Minimum: 2018-01-02 00:00:00
Maximum: 2021-04-01 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

FEC_ALTA
Date

MISSING

Distinct: 902
Distinct (%): 24.3%
Missing: 363994
Missing (%): 99.0%
Memory size: 2.8 MiB
Minimum: 2018-01-02 00:00:00
Maximum: 2021-11-01 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

Ficha Básica
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 154
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.011647924
Minimum: 0
Maximum: 3206
Zeros: 250633
Zeros (%): 68.2%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 5
Maximum: 3206
Range: 3206
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.103286641
Coefficient of variation (CV): 8.998473107
Kurtosis: 64629.21345
Mean: 1.011647924
Median Absolute Deviation (MAD): 0
Skewness: 223.0141021
Sum: 371988
Variance: 82.86982767
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 250633 | 68.2%
1 | 54529 | 14.8%
2 | 22148 | 6.0%
3 | 11217 | 3.1%
4 | 7327 | 2.0%
5 | 6525 | 1.8%
6 | 5205 | 1.4%
7 | 2688 | 0.7%
8 | 1733 | 0.5%
9 | 1094 | 0.3%
Other values (144) | 4606 | 1.3%

Minimum 5 values

Value | Count | Frequency (%)
0 | 250633 | 68.2%
1 | 54529 | 14.8%
2 | 22148 | 6.0%
3 | 11217 | 3.1%
4 | 7327 | 2.0%

Maximum 5 values

Value | Count | Frequency (%)
3206 | 1 | < 0.1%
2342 | 1 | < 0.1%
2062 | 1 | < 0.1%
1659 | 1 | < 0.1%
1085 | 1 | < 0.1%

Perfil Promocional
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.153185298
Minimum: 0
Maximum: 5
Zeros: 24290
Zeros (%): 6.6%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 1
95th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8747251344
Coefficient of variation (CV): 0.758529558
Kurtosis: 11.63350115
Mean: 1.153185298
Median Absolute Deviation (MAD): 0
Skewness: 3.293654164
Sum: 424032
Variance: 0.7651440608
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value | Count | Frequency (%)
1 | 313236 | 85.2%
0 | 24290 | 6.6%
5 | 12846 | 3.5%
2 | 8991 | 2.4%
3 | 4784 | 1.3%
4 | 3558 | 1.0%

Minimum 5 values

Value | Count | Frequency (%)
0 | 24290 | 6.6%
1 | 313236 | 85.2%
2 | 8991 | 2.4%
3 | 4784 | 1.3%
4 | 3558 | 1.0%

Maximum 5 values

Value | Count | Frequency (%)
5 | 12846 | 3.5%
4 | 3558 | 1.0%
3 | 4784 | 1.3%
2 | 8991 | 2.4%
1 | 313236 | 85.2%

N_logins
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 161
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.676974749
Minimum: 0
Maximum: 1307
Zeros: 170625
Zeros (%): 46.4%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 2
95th percentile: 7
Maximum: 1307
Range: 1307
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.868763107
Coefficient of variation (CV): 2.903301383
Kurtosis: 15831.76257
Mean: 1.676974749
Median Absolute Deviation (MAD): 1
Skewness: 77.6662565
Sum: 616632
Variance: 23.70485419
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 170625 | 46.4%
1 | 81097 | 22.1%
2 | 41029 | 11.2%
3 | 23724 | 6.5%
4 | 15059 | 4.1%
5 | 10228 | 2.8%
6 | 6875 | 1.9%
7 | 4784 | 1.3%
8 | 3435 | 0.9%
9 | 2440 | 0.7%
Other values (151) | 8409 | 2.3%

Minimum 5 values

Value | Count | Frequency (%)
0 | 170625 | 46.4%
1 | 81097 | 22.1%
2 | 41029 | 11.2%
3 | 23724 | 6.5%
4 | 15059 | 4.1%

Maximum 5 values

Value | Count | Frequency (%)
1307 | 1 | < 0.1%
559 | 1 | < 0.1%
481 | 1 | < 0.1%
478 | 1 | < 0.1%
437 | 1 | < 0.1%

N_sesiones
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 225
Distinct (%): 0.1%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.314771201
Minimum: 1
Maximum: 6346
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 3
Median: 3
Q3: 5
95th percentile: 16
Maximum: 6346
Range: 6345
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 17.345244
Coefficient of variation (CV): 3.263591855
Kurtosis: 65626.75611
Mean: 5.314771201
Median Absolute Deviation (MAD): 0
Skewness: 218.6421765
Sum: 1954252
Variance: 300.8574895
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
3 | 222788 | 60.6%
5 | 38327 | 10.4%
4 | 34009 | 9.2%
6 | 14120 | 3.8%
7 | 11584 | 3.2%
8 | 5869 | 1.6%
9 | 5474 | 1.5%
10 | 3692 | 1.0%
11 | 2658 | 0.7%
13 | 2335 | 0.6%
Other values (215) | 26846 | 7.3%

Minimum 5 values

Value | Count | Frequency (%)
1 | 4 | < 0.1%
2 | 2196 | 0.6%
3 | 222788 | 60.6%
4 | 34009 | 9.2%
5 | 38327 | 10.4%

Maximum 5 values

Value | Count | Frequency (%)
6346 | 1 | < 0.1%
4017 | 1 | < 0.1%
3696 | 1 | < 0.1%
3154 | 1 | < 0.1%
2033 | 1 | < 0.1%

díasHastaCliente
Real number (ℝ)

Distinct: 1312
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1218.82716
Minimum: -322
Maximum: 1595
Zeros: 1395
Zeros (%): 0.4%
Memory size: 2.8 MiB

Quantile statistics

Minimum: -322
5th percentile: 900
Q1: 1028
Median: 1217
Q3: 1423
95th percentile: 1551
Maximum: 1595
Range: 1917
Interquartile range (IQR): 395

Descriptive statistics

Standard deviation: 235.0874545
Coefficient of variation (CV): 0.1928800589
Kurtosis: 2.544942124
Mean: 1218.82716
Median Absolute Deviation (MAD): 197
Skewness: -0.7623160509
Sum: 448168841
Variance: 55266.11126
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 1395 | 0.4%
1523 | 976 | 0.3%
1516 | 967 | 0.3%
1517 | 964 | 0.3%
1539 | 948 | 0.3%
1524 | 940 | 0.3%
1453 | 937 | 0.3%
1448 | 932 | 0.3%
1412 | 931 | 0.3%
1490 | 928 | 0.3%
Other values (1302) | 357787 | 97.3%

Minimum 5 values

Value | Count | Frequency (%)
-322 | 1 | < 0.1%
-320 | 1 | < 0.1%
-313 | 1 | < 0.1%
-302 | 1 | < 0.1%
-291 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1595 | 45 | < 0.1%
1594 | 473 | 0.1%
1593 | 804 | 0.2%
1592 | 280 | 0.1%
1591 | 315 | 0.1%

Interactions

[Figures: 64 interaction plots (Matplotlib v3.4.3); images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
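
As a minimal sketch of the calculation just described, assuming two equal-length lists of numbers with non-zero variance, r can be computed in plain Python as follows. The function name `pearson_r` is illustrative and is not part of the profiling tool.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The common 1/n factors of the covariance and the variances cancel,
    # so plain sums of products and squares are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(pearson_r(xs, ys))                       # perfectly linear, increasing
print(pearson_r(xs, [8, 6, 4, 2]))             # perfectly linear, decreasing
print(pearson_r([3 * x + 5 for x in xs], ys))  # shifting and scaling xs leaves r unchanged
```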

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
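
The rank-based calculation can be sketched in plain Python as below. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative, and assigning tied values the average of their ranks is an assumption of this sketch (a common convention), not something stated in the report.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]     # monotonic in xs but not linear
print(spearman_rho(xs, ys))  # the ranks agree exactly
print(pearson_r(xs, ys))     # lower, because the relation is not linear
```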

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
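
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch, with the illustrative name `kendall_tau_a`, is shown below; it counts tied pairs as neither concordant nor discordant. Library implementations often use the tie-adjusted tau-b instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A positive product means both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 2, 3]))        # all pairs concordant
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))        # all pairs discordant
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))  # 5 concordant, 1 discordant, 6 pairs
```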

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
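
A minimal sketch of a bias-corrected Cramér's V is given below, assuming the input is a contingency table of observed counts given as a list of rows. The function name `cramers_v_corrected` is illustrative, and the correction terms follow the commonly cited form of Bergsma's proposal; this is not necessarily the exact code used to produce the report.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson's chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi squared and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denominator)


print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent rows and columns
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfect association
```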

Missing values

[Figures: 4 missing-value plots (Matplotlib v3.4.3); images not preserved]

Sample

First rows

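(index) | IDUSUARIO | TIPOUSUARIO | FEC_REGISTRO | CANAL_REGISTRO | IND_CLIENTE | IND_ALTA | TIPOEMAIL | BONDAD_EMAIL | USU_TELF | IPCASOS | IP_Country | IP_Area | USU_TIPO | USU_TAMANIO | USU_CIIU | USU_ESTADO | USU_DEPARTAMENTO | FEC_CLIENTE | FEC_ALTA | Ficha Básica | Perfil Promocional | N_logins | N_sesiones | díasHastaCliente
0 | 8107310 | PF | 2019-10-22 | 3.0 | 0 | 0 | yahoo.com | 0 | 233XXXXX | 1 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 0.0 | 1.0 | 1.0 | 5.0 | 936
1 | 7784565 | PJ | 2019-05-14 | 3.0 | 0 | 0 | gmail.com | 20 | 633XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | PQ | N7820 | ACTIVA | QUINDIO | NaT | NaT | 0.0 | 1.0 | 3.0 | 3.0 | 1097
2 | 7718778 | PJ | 2019-09-04 | 7.0 | 0 | 0 | hotmail.com | 20 | 533XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | MC | G4774 | ACTIVA | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 984
3 | 7952765 | PX | 2019-12-08 | 3.0 | 0 | 0 | uqvirtual.edu.co | 20 | 633XXXXX | 1 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 0.0 | 1.0 | 1.0 | 3.0 | 889
4 | 7855424 | PJ | 2019-06-21 | 7.0 | 0 | 0 | hotmail.com | 20 | 533XXXXX | 1 | Colombia | America | EMPRESARIO INDIVIDUAL | MC | N8299 | CANCELACIÓN | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 1059
5 | 8031418 | PX | 2019-09-18 | 3.0 | 0 | 0 | gmail.com | 20 | 633XXXXX | 2 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 1.0 | 1.0 | 3.0 | 5.0 | 970
6 | 8189769 | PJ | 2019-11-27 | 3.0 | 0 | 0 | yahoo.com | 20 | 233XXXXX | 1 | Colombia | America | HOLDING | MC | M7010 | ACTIVA | VALLE | NaT | NaT | 0.0 | 1.0 | 1.0 | 5.0 | 900
7 | 7658143 | PJ | 2019-12-03 | 2.0 | 0 | 0 | marval.com.co | 20 | 63XXXXX | 4 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | GR | F4111 | ACTIVA | SANTANDER | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 894
8 | 7970569 | PJ | 2019-08-21 | 7.0 | 0 | 0 | gmail.com | 20 | 533XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | MC | G4752 | ACTIVA | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 998
9 | 7594738 | PJ | 2019-02-14 | 7.0 | 0 | 0 | amparoldeb.com | 20 | 633XXXXX | 1 | Colombia | America | ENTIDAD FINANCIERA O DE SEGUROS | MC | K6621 | ACTIVA | RISARALDA | NaT | NaT | 0.0 | 1.0 | 2.0 | 3.0 | 1186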

Last rows

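(index) | IDUSUARIO | TIPOUSUARIO | FEC_REGISTRO | CANAL_REGISTRO | IND_CLIENTE | IND_ALTA | TIPOEMAIL | BONDAD_EMAIL | USU_TELF | IPCASOS | IP_Country | IP_Area | USU_TIPO | USU_TAMANIO | USU_CIIU | USU_ESTADO | USU_DEPARTAMENTO | FEC_CLIENTE | FEC_ALTA | Ficha Básica | Perfil Promocional | N_logins | N_sesiones | díasHastaCliente
367695 | 8188180 | PF | 2019-11-27 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 3225 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 1.0 | 1.0 | 1.0 | 4.0 | 900
367696 | 8103216 | PF | 2019-10-21 | 1.0 | 0 | 0 | gmail.com | 20 | NaN | 4 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 2.0 | 2.0 | 5.0 | 11.0 | 937
367697 | 8205258 | PF | 2019-05-12 | 1.0 | 0 | 0 | gmail.com | 0 | NaN | 373 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 13.0 | 5.0 | 3.0 | 34.0 | 1099
367698 | 8108800 | PF | 2019-10-23 | 1.0 | 0 | 0 | hotmail.com | 20 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 2.0 | 1.0 | 3.0 | 9.0 | 935
367699 | 8136973 | PF | 2019-05-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 11.0 | 5.0 | 4.0 | 25.0 | 1100
367700 | 8141168 | PF | 2019-06-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 3.0 | 2.0 | 1.0 | 9.0 | 1069
367701 | 8147354 | PF | 2019-08-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 4.0 | 5.0 | 14.0 | 1008
367702 | 8153565 | PF | 2019-12-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 3.0 | 3.0 | 14.0 | 886
367703 | 8169002 | PF | 2019-11-18 | 1.0 | 0 | 0 | gesticobranzas.com | 9 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 3.0 | 1.0 | 1.0 | 8.0 | 909
367704 | 8205187 | PF | 2019-05-12 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 4.0 | 3.0 | 15.0 | 1099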